DETAILED ACTION

Claim Rejections - 35 USC § 102 

1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(a) the invention was known or used by others in this country, or patented or described in a printed 
publication in this or a foreign country, before the invention thereof by the applicant for a patent. 

(e) the invention was described in a patent granted on an application for patent by another filed in the 
United States before the invention thereof by the applicant for patent, or on an international application 
by another who has fulfilled the requirements of paragraphs (1), (2), and (4) of section 371(c) of this
title before the invention thereof by the applicant for patent. 

2. Claims 1 to 6, 8, and 10 to 12 are rejected under 35 U.S.C. 102(a) as being 
clearly anticipated by Klakow ("Selecting articles from the language model training
corpus").

3. Applicants cannot rely upon the foreign priority papers to overcome this rejection 
because a translation of said papers has not been made of record in accordance with 
37 CFR 1.55. See MPEP § 201.15.



4. Claims 7 and 9 are rejected under 35 U.S.C. 102(e) as being anticipated by 
Ramaswamy et al. 

Regarding independent claim 7, Ramaswamy et al. discloses a method of 
building language models for speech recognition, wherein:

"a text corpus part of a given first text corpus is gradually extended by one or
various other text corpus parts of the first text corpus in dependence on text data of an 
application-specific text corpus to form a second text corpus that is iteratively extended 
until a predefined criterion achieves a substantially minimum value and in that the 
values of the language model are generated while the second text corpus from the last 
iteration is used" - language model constructor 50 reads linguistic units from seed 
corpus 10 and constructs an initial reference language model 80 from these linguistic 
units; once an initial reference language model 80 ("a first text corpus") is constructed, 
iterative corpus extractor 60 reads linguistic units ("one or various text corpus parts") 
from external corpus 20 and computes a relevance score for each linguistic unit in 
accordance with language model 80; an iterative language model building technique 
generates a final language model 90 ("a second text corpus") from a small, domain-restricted
seed corpus 15 ("in dependence on text data of an application-specific text
corpus") and a large, less restricted external corpus 20; the linguistic units in seed 
corpus 15 are all highly relevant to a common domain or field ("an application-specific 
text corpus"), and external corpus 20 contains text data that is less relevant to the 
domain of interest than the data within the seed corpus; final language model 90 is used 
in language processing applications (column 2, line 40 to column 3, line 63: Figures 1 
and 2); a minimized perplexity change is "a predefined criterion achieving a substantially 
minimum value"; threshold comparator 63 accepts only those linguistic units that are 
below a perplexity threshold (column 4, lines 15 to 54: Figure 3); model checker 70 
evaluates language model quality by using the size of the perplexity change; if the
perplexity score for the current iteration is higher than those from prior iterations,
indicating a reduction in quality, then the language model building process may be 
considered complete (column 5, lines 6 to 34: Figure 4); thus, iterations continue until a 
predefined criterion, a size of perplexity change, is minimized. 
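
For illustration only, the selection step attributed above to threshold comparator 63 can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the reference: the add-one smoothed unigram model, the function names (`train_unigram`, `perplexity`, `select_units`), and the example threshold are all hypothetical and do not appear in Ramaswamy et al.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Build an add-one smoothed unigram model (an assumed stand-in for
    the reference language model) from tokenized linguistic units."""
    counts = Counter(tok for sent in sentences for tok in sent)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen tokens

    def prob(tok):
        return (counts[tok] + 1) / (total + vocab)

    return prob


def perplexity(prob, unit):
    """Per-token perplexity of one non-empty linguistic unit."""
    log_sum = sum(math.log(prob(tok)) for tok in unit)
    return math.exp(-log_sum / len(unit))


def select_units(seed, external, threshold):
    """Keep external units whose perplexity under the seed model is
    below the threshold; empty units are skipped."""
    prob = train_unigram(seed)
    return [u for u in external if u and perplexity(prob, u) < threshold]


seed = [["book", "a", "flight"], ["book", "a", "hotel"]]
external = [["book", "a", "flight"], ["stock", "prices", "fell"]]
selected = select_units(seed, external, threshold=8.0)
```

In this toy example only the in-domain unit falls below the threshold, so only that unit would be added to the corpus used for the next iteration.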

Regarding independent claim 9, Ramaswamy et al. discloses a method of
building language models for speech recognition, wherein: 

"a part of a given acoustic training material, which represents a multitude of 
speech utterances, is gradually extended by one or more parts of the given acoustic 
training material and in that the acoustic references of the acoustic model are formed by 
means of the accumulated parts of the given acoustic training material once a 
predefined criteria achieves a substantially minimum value" - language model 
constructor 50 reads linguistic units ("one or more parts of the given acoustic training 
material") from seed corpus 10 and constructs an initial reference language model 80 
from these linguistic units; once an initial reference language model 80 is constructed, 
iterative corpus extractor 60 reads linguistic units from external corpus 20 and computes 
a relevance score for each linguistic unit in accordance with language model 80, and 
incrementally increases the size of the initial reference language model 80 ("is gradually 
extended by one or more parts of the given acoustic training material"); an iterative 
language model building technique generates a final language model 90 ("the acoustic 
model") from a small, domain-restricted seed corpus 15 and a large, less restricted 
external corpus 20; the linguistic units in seed corpus 15 are all highly relevant to a
common domain or field, and external corpus 20 contains text data that is less relevant
to the domain of interest than the data within the seed corpus; final language model 90 
is used in language processing applications (column 2, line 40 to column 4, line 7: 
Figures 1 and 2); minimizing perplexity change is "a predefined criterion achieves a 
substantially minimum value"; threshold comparator 63 accepts only those linguistic 
units that are below a perplexity threshold (column 4, lines 15 to 54: Figure 3); model 
checker 70 evaluates language model quality by using the size of the perplexity change; 
if the perplexity score for the current iteration is higher than those from prior iterations, 
indicating a reduction in quality, then the language model building process may be 
considered complete (column 5, lines 6 to 34: Figure 4); thus, iterations continue until a 
predefined criterion, a size of perplexity change, is minimized; implicitly, linguistic units 
are acoustic units in speech recognition. 

Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

6. Claims 1, 2, 5/1, 5/2, 6/5/1, 6/5/2, 8, and 10 to 12 are rejected under 35 U.S.C.
103(a) as being unpatentable over Ramaswamy et al. in view of Bandara et al.

Regarding independent claim 1, Ramaswamy et al. discloses a method of
generating a language model for speech recognition, wherein:

"a first text corpus is gradually [reduced] by one or more various text corpus parts
in dependence on text data of an application-specific second text corpus until a final text 
corpus is obtained from iterations of [reductions] of the first text corpus according to 
predefined criterion achieving a substantially minimum value" - language model 
constructor 50 reads linguistic units from seed corpus 10 and constructs an initial 
reference language model 80 from these linguistic units; once an initial reference 
language model 80 ("a first text corpus") is constructed, iterative corpus extractor 60 
reads linguistic units ("one or various text corpus parts") from external corpus 20 and 
computes a relevance score for each linguistic unit in accordance with language model 
80; an iterative language model building technique generates a final language model 90 
from a small, domain-restricted seed corpus 15 ("in dependence on text data of an 
application-specific second text corpus") and a large, less restricted external corpus 20; 
the linguistic units in seed corpus 15 ("an application-specific second text corpus") are 
all highly relevant to a common domain or field, and external corpus 20 contains text 
data that is less relevant to the domain of interest than the data within the seed corpus 
(column 2, line 40 to column 3, line 63: Figures 1 and 2); a minimized perplexity change 
is "predefined criterion achieving a substantially minimum value"; threshold comparator
63 accepts only those linguistic units that are below a perplexity threshold (column 4, 
lines 15 to 54: Figure 3); model checker 70 evaluates language model quality by using 
the size of the perplexity change; if the perplexity score for the current iteration is higher 
than those from prior iterations, indicating a reduction in quality, then the language 
model building process may be considered complete (column 5, lines 6 to 34: Figure 4);
thus, iterations continue until a predefined criterion, a size of perplexity change, is
minimized; 

"in that the values of the language model are generated on the basis of the 
[reduced] final text corpus" - final language model 90 is used in language processing 
applications (column 2, line 40 to column 3, line 63: Figures 1 and 2). 

Regarding independent claim 1, Ramaswamy et al. discloses a method of 
building language models by iteratively increasing the size of a language model by 
adding units from a large external text corpus, where the added units are similar to 
linguistic units in a seed corpus. Thus, Ramaswamy et al. discloses gradually and 
iteratively increasing the size of the language model but omits gradually reducing the 
size of the language model by reductions. Still, one of ordinary skill in the art would 
recognize that the language model building method of Ramaswamy et al. might be 
reversed in order gradually to reduce the size of the language model instead of 
gradually increasing its size. That is, the large external text corpus 20 may be gradually 
reduced when linguistic units iteratively are compared to, and found to be different from, 
those in the seed corpus. Bandara et al. teaches a method for adapting the size of a 
language model in a speech recognition system, where an acoustic distance is 
calculated, and the contents of the language model are reduced with respect to acoustic 
distance. (Column 5, Lines 20 to 63: Figure 2) The stated advantage is the size of the 
language model is reduced, while retaining accuracy. (Column 3, Line 56 to Column 4, 
Line 24) It would have been obvious to one having ordinary skill in the art to reverse the 
language model building process of Ramaswamy et al. as suggested by Bandara et al.
for the purpose of reducing the size of the language model, while retaining recognition
accuracy. 

Regarding claim 2, Bandara et al. discloses calculating the language model
parameters based upon trigram, bigram, and unigram probabilities (column 2, lines 20 
to 67). 

Regarding claims 5/1 and 5/2, Ramaswamy et al. discloses a test corpus ("test
text") is used by model checker 70 to evaluate the language model quality, calling for
further language building iterations, if necessary, until its quality is satisfactory (column
3, lines 6 to 14; column 3, line 64 to column 4, line 7).

Regarding claims 6/5/1 and 6/5/2, Ramaswamy et al. discloses iterative corpus
extractor computes a relevance score based upon a perplexity measure relative to a
threshold to determine how many linguistic units to add to the language model (column
4, lines 7 to 54).

Regarding independent claim 8, Ramaswamy et al. discloses a method of
generating a language model for speech recognition, wherein: 

"acoustic training material representing a first number of speech utterances is 
gradually [reduced] until a predefined criterion achieves a substantially minimum value
by acoustic training material parts representing individual speech utterances in 
dependence on a second number of application-specific speech utterances" - language 
model constructor 50 reads linguistic units ("training material representing a number of 
speech utterances") from seed corpus 10 and constructs an initial reference language 
model 80 from these linguistic units; once an initial reference language model 80 ("a first
number of speech utterances") is constructed, iterative corpus extractor 60 reads
linguistic units from external corpus 20 and computes a relevance score for each 
linguistic unit in accordance with language model 80, and incrementally increases the 
size of the initial reference language model 80; an iterative language model building 
technique generates a final language model 90 from a small, domain-restricted seed 
corpus 15 ("in dependence on a second number of application-specific speech 
utterances") and a large, less restricted external corpus 20; the linguistic units in seed 
corpus 15 are all highly relevant to a common domain or field, and external corpus 20 
contains text data that is less relevant to the domain of interest than the data within the 
seed corpus (column 2, line 40 to column 4, line 7: Figures 1 and 2); implicitly, linguistic 
units are acoustic units in speech recognition; a minimized perplexity change is "a 
predefined criterion achieves a substantially minimum value"; threshold comparator 63 
accepts only those linguistic units that are below a perplexity threshold (column 4, lines 
15 to 54: Figure 3); model checker 70 evaluates language model quality by using the 
size of the perplexity change; if the perplexity score for the current iteration is higher 
than those from prior iterations, indicating a reduction in quality, then the language 
model building process may be considered complete (column 5, lines 6 to 34: Figure 4);
thus, iterations continue until a predefined criterion, a size of perplexity change, is
minimized; 

"in that the acoustic references of the acoustic model are formed by means of the 
[reduced] acoustic training material" - final language model 90 is used in language 
processing applications (column 2, line 40 to column 3, line 63: Figures 1 and 2).

Regarding independent claim 8, Ramaswamy et al. discloses a method of
building language models by iteratively increasing the size of a language model by 
adding units from a large external text corpus, where the added units are similar to 
linguistic units in a seed corpus. Thus, Ramaswamy et al. discloses gradually
increasing the size of the language model but omits gradually reducing the size of the
language model. Still, one of ordinary skill in the art would recognize that the language
model building method of Ramaswamy et al. might be reversed in order gradually to
reduce the size of the language model instead of gradually increasing its size. That is, 
the large external text corpus 20 may be gradually reduced when linguistic units 
iteratively are compared to, and found to be different from, those in the seed corpus. 
Bandara et al. teaches a method for adapting the size of a language model in a speech 
recognition system, where an acoustic distance is calculated, and the contents of the 
language model are reduced with respect to acoustic distance. (Column 5, Lines 20 to 
63: Figure 2) The stated advantage is the size of the language model is reduced, while 
retaining accuracy. (Column 3, Line 56 to Column 4, Line 24) It would have been 
obvious to one having ordinary skill in the art to reverse the language model building 
process of Ramaswamy et al. as suggested by Bandara et al. for the purpose of
reducing the size of the language model, while retaining recognition accuracy.

Regarding claims 10 and 11, Ramaswamy et al. discloses a method for
generating a language model and acoustic models of linguistic units in speech
recognition.

Regarding claim 12, Ramaswamy et al. discloses a method of building language
models for speech recognition, wherein a language model constructor 50 generates a 
final language model 90 ("final text corpus") from a relatively small seed corpus 15 ("a 
small test corpus") (column 2, lines 40 to 50: Figure 1). 

7. Claims 3, 4, 5/3, 5/4, 6/5/3, and 6/5/4 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Ramaswamy et al. in view of Bandara et al. as applied to
claims 1 and 2 above, and further in view of Klakow ("Language-model optimization by 
mapping of corpora"). 

Concerning claim 3, Ramaswamy et al. discloses calculating a relevance score,
but omits the selection criterion of the equation. However, Klakow ("Language-model
optimization by mapping of corpora") discloses mapping of training corpora by an
n-gram perplexity criterion involving the equation. (Page 702, Left Column) This is stated
to have the advantage of reduced perplexity for speech recognition applications. (Page
701) It would have been obvious to one having ordinary skill in the art to apply the
equation taught by Klakow ("Language-model optimization by mapping of corpora") as
the relevance score of Ramaswamy et al. for the purpose of reducing perplexity in
speech recognition applications. 

Concerning claim 4, Bandara et al. discloses calculating the language model
parameters based upon trigram, bigram, and unigram probabilities (column 2, lines 20
to 67).

Concerning claims 5/3 and 5/4, Ramaswamy et al. discloses a test corpus ("test
text") is used by model checker 70 to evaluate the language model quality, calling for
further language building iterations, if necessary, until its quality is satisfactory (column
3, lines 6 to 14; column 3, line 64 to column 4, line 7).

Concerning claims 6/5/3 and 6/5/4, Ramaswamy et al. discloses iterative corpus
extractor computes a relevance score based upon a perplexity measure relative to a
threshold to determine how many linguistic units to add to the language model (column
4, lines 7 to 54).

Response to Arguments 

8. Applicants' arguments filed 10 September 2004 have been fully considered but 
they are not persuasive. 

Regarding the rejection of claims 1 to 6, 8, and 10 to 12 under 35 U.S.C. 102(a)
as being clearly anticipated by Klakow ("Selecting articles from the language model 
training corpus"), Applicants argue the reference fails to disclose or suggest the 
limitation of "a predefined criterion achieves a substantially minimum value". This 
position is traversed.

Klakow discloses the invention as claimed. On Page 1695, "2. Selecting Articles
from a Corpus", Klakow says the selection criterion to be used to judge which articles to 
remove involves minimizing the perplexity of the test data. The unigram perplexity is 
equivalent to the log-likelihood, where the change in the log-likelihood when article A is 
removed from the training corpus is given by ΔF in Equation (1). (Pages 1695 to 1696:
2.1 The Unigram Selection Criterion) Thus, Klakow anticipates the limitation of "a
predefined criterion achieves a substantially minimum value" by minimizing the
perplexity of the test data. Identically, the Specification, Page 4, Lines 25 to 31, and
Page 5, Lines 26 to 28, says that the selection criterion is a minimal perplexity. 
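
For illustration only, the effect on test-data log-likelihood of removing one article from the training corpus can be sketched as follows. This is a minimal sketch under stated assumptions and is not Equation (1) of Klakow: it simply recomputes an add-one smoothed unigram log-likelihood with and without the article, and the function names (`heldout_log_likelihood`, `delta_log_likelihood`) and toy data are hypothetical.

```python
import math
from collections import Counter


def heldout_log_likelihood(train_counts, test_counts, vocab_size):
    """Log-likelihood of the test data under an add-one smoothed
    unigram model estimated from train_counts (smoothing is assumed)."""
    total = sum(train_counts.values())
    return sum(
        n * math.log((train_counts[w] + 1) / (total + vocab_size))
        for w, n in test_counts.items()
    )


def delta_log_likelihood(articles, index, test_tokens):
    """Change in test log-likelihood when articles[index] is removed.
    A positive value means removal lowers the test perplexity."""
    full = Counter(tok for art in articles for tok in art)
    reduced = full - Counter(articles[index])  # counts without the article
    test_counts = Counter(test_tokens)
    vocab_size = len(set(full) | set(test_counts))  # fixed for both models
    before = heldout_log_likelihood(full, test_counts, vocab_size)
    after = heldout_log_likelihood(reduced, test_counts, vocab_size)
    return after - before


articles = [["flight", "to", "boston"], ["stock", "market", "news"]]
test_tokens = ["flight", "to", "boston"]
gain_off_domain = delta_log_likelihood(articles, 1, test_tokens)
gain_in_domain = delta_log_likelihood(articles, 0, test_tokens)
```

In this toy example, removing the off-domain article raises the test log-likelihood, while removing the in-domain article lowers it, which is the direction of selection that a perplexity-minimizing criterion would favor.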

Regarding the rejections of claims 7 and 9 under 35 U.S.C. 102(e) as being 
anticipated by Ramaswamy et al., Applicants argue the references fail to disclose or
suggest the limitation of "until a predefined criterion achieves a substantially minimum 
value". This position is traversed. 

Ramaswamy et al. discloses an iterative language building technique where 
change in perplexity is minimized. Threshold comparator 63 accepts only those 
linguistic units that are below a perplexity threshold. (Column 4, Lines 15 to 54: Figure
3). Model checker 70 evaluates language model quality by using the size of the 
perplexity change. Model evaluator compares the perplexity score of the most recent 
iteration to perplexity scores from prior iterations. If the perplexity score for the current 
iteration is higher than those from prior iterations, indicating a reduction in quality, then 
the language model building process may be considered complete. If the current 
perplexity score is about the same as, or only slightly lower than prior scores, indicating 
no significant improvement in quality for the current iteration, then the language model 
building process may be considered complete. (Column 5, lines 6 to 34: Figure 4) That 
is, iterations continue if the perplexity score changes significantly in a manner that the 
perplexity is lowered, but the iterations stop if perplexity begins to increase or decrease 
only slightly. Thus, iterations continue until a predefined criterion, a size of perplexity
change, is minimized. Identically, the Specification, Page 4, Lines 25 to 31, and Page
5, Lines 26 to 28, says that the selection criterion is a minimal perplexity. 
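
For illustration only, the stopping behavior described above for model checker 70 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `should_stop`, the comparison against the best prior score, and the 1% relative-improvement cutoff standing in for "only slightly lower" are hypothetical and are not taken from Ramaswamy et al.

```python
def should_stop(perplexities, min_improvement=0.01):
    """Decide whether iteration is complete.

    perplexities: scores per iteration, most recent last.
    min_improvement: assumed relative drop below which an improvement
    counts as "only slightly lower".
    """
    if len(perplexities) < 2:
        return False  # nothing to compare against yet
    current = perplexities[-1]
    best_prior = min(perplexities[:-1])
    if current >= best_prior:
        return True  # quality stayed the same or got worse
    relative_gain = (best_prior - current) / best_prior
    return relative_gain < min_improvement  # improvement too small


history = [120.0, 100.0]
continue_case = should_stop(history)          # large drop: keep iterating
slight_case = should_stop(history + [99.5])   # 0.5% drop: stop
worse_case = should_stop(history + [105.0])   # perplexity rose: stop
```

Under these assumptions, iterations continue only while the perplexity drops significantly, and stop once the perplexity rises or the size of the perplexity change becomes small.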

Regarding the rejections of claims 1, 2, 5/1, 5/2, 6/5/1, 6/5/2, 8, and 10 to 12
under 35 U.S.C. 103(a) as being unpatentable over Ramaswamy et al. in view of 
Bandara et al., Applicants argue the references fail to disclose all of the elements of the 
claims. Applicants state Ramaswamy et al. discloses a reference language model is 
generated based on a small seed corpus containing linguistic units relevant to a 
domain, whereas Bandara et al. discusses the contents of a language model with 
respect to an acoustic distance. 

However, Applicants' arguments amount to a general allegation that the claims 
define a patentable invention without specifically pointing out how the language of the 
claims patentably distinguishes them from the references. Applicants have not pointed 
out specifically what language of the claims is not met by the combination of 
Ramaswamy et al. in view of Bandara et al. 

Applicants are merely attacking the references individually without addressing 
the basis of the combination. One cannot show nonobviousness by attacking 
references individually where the rejections are based on combinations of references. 
See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800
F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).

Therefore, the rejections of claims 1 to 6, 8, and 10 to 12 under 35
U.S.C. 102(a) as being clearly anticipated by Klakow ("Selecting articles from the
language model training corpus"), of claims 7 and 9 under 35 U.S.C. 102(e)
as being anticipated by Ramaswamy et al., of claims 1, 2, 5/1, 5/2, 6/5/1, 6/5/2, 8, and
10 to 12 under 35 U.S.C. 103(a) as being unpatentable over Ramaswamy et al. in view
of Bandara et al., and of claims 3, 4, 5/3, 5/4, 6/5/3, and 6/5/4 under 35 U.S.C. 103(a)
as being unpatentable over Ramaswamy et al. in view of Bandara et al., and further in
view of Klakow ("Language-model optimization by mapping of corpora"), are proper.

Conclusion 

9. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Martin Lemer whose telephone number is (703) 308-9064.
The examiner can normally be reached on 8:30 AM to 6:00 PM Monday to
Thursday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Richemond Dorvil, can be reached on (703) 305-9645. The fax phone
number for the organization where this application or proceeding is assigned is (703)
872-9306.

Any inquiry of a general nature or relating to the status of this application or
proceeding should be directed to the receptionist whose telephone number is
(703) 305-4700.